Comparison of Supervised and Semi-Supervised Fuzzy Clusters in Text Categorization

Author

  • Mohammed Abdul Wajeed
Abstract

Electronic gadgets are now part of everyday life; as a result, abundant data is generated, and its volume is growing at an exponential rate. Previously, generated data was simply dumped into repositories. This paper proposes a classified repository so that later retrieval of, or navigation through, the stored data becomes easy. It compares supervised and semi-supervised classification. Semi-supervised classification lies halfway between the supervised and unsupervised paradigms and can be employed where only a limited amount of training data is available. During feature reduction, the features (which in text classification are words) are grouped into clusters based on fuzzy concepts. Three types of clusters are formed, namely soft, hard, and mixed, and different similarity measures are employed in the text classification process. The results obtained are encouraging.
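
The abstract does not spell out how the fuzzy word clusters are built or how the "mixed" variant is defined, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes a hypothetical table `MEMBERSHIP` giving each word's fuzzy membership degree in two clusters, and shows how a document's word counts could be reduced to cluster-level features in a soft way (counts spread across clusters) or a hard way (counts assigned to the strongest cluster). Cosine similarity stands in for one possible similarity measure; the function names and the toy data are illustrative assumptions.

```python
import math

# Hypothetical fuzzy memberships (assumption, not from the paper):
# word -> degree of membership in each of 2 clusters.
MEMBERSHIP = {
    "goal":  [0.9, 0.1],
    "match": [0.7, 0.3],
    "vote":  [0.2, 0.8],
    "party": [0.4, 0.6],
}


def reduce_soft(term_counts, membership, n_clusters):
    """Spread each word's count over all clusters in proportion to its membership."""
    features = [0.0] * n_clusters
    for word, count in term_counts.items():
        degrees = membership.get(word)
        if degrees is None:
            continue  # word is not covered by any cluster
        for k, mu in enumerate(degrees):
            features[k] += mu * count
    return features


def reduce_hard(term_counts, membership, n_clusters):
    """Give each word's whole count to the cluster where its membership is largest."""
    features = [0.0] * n_clusters
    for word, count in term_counts.items():
        degrees = membership.get(word)
        if degrees is None:
            continue
        best = max(range(n_clusters), key=lambda k: degrees[k])
        features[best] += count
    return features


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy document as word counts (illustrative data only).
doc = {"goal": 2, "match": 1, "party": 1}
soft = reduce_soft(doc, MEMBERSHIP, 2)
hard = reduce_hard(doc, MEMBERSHIP, 2)
print("soft features:", soft)
print("hard features:", hard)
print("cosine(soft, hard):", cosine(soft, hard))
```

Either reduction replaces a vocabulary-sized vector with one entry per cluster, which is the feature-reduction effect the abstract refers to; a classifier could then compare documents in this smaller space.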

Related resources

Text Categorization using the Semi-Supervised Fuzzy c-Means Algorithm

Text Categorization (TC) is the automated assignment of text documents to predefined categories based on document contents. For the past few years, TC has become very important essentially in the Information Retrieval area, where information needs have tremendously increased with the rapid growth of textual information sources such as the Internet. In this paper, we compare, for text categoriz...

A fuzzy semi-supervised support vector machine approach to hypertext categorization

Hypertext/text domains are characterized by several tens or hundreds of thousands of features. This represents a challenge for supervised learning algorithms which have to learn accurate classifiers using a small set of available training examples. In this paper, a fuzzy semi-supervised support vector machines (FSS-SVM) algorithm is proposed. It tries to overcome the need for a large labelled t...

Semi-supervised Collaborative Text Classification

Most text categorization methods require text content of documents that is often difficult to obtain. We consider “Collaborative Text Categorization”, where each document is represented by the feedback from a large number of users. Our study focuses on the semisupervised case in which one key challenge is that a significant number of users have not rated any labeled document. To address this pr...

Publication date: 2012